Ancient Scripts, Modern AI: Bridging the Divide with Morphology-Aware Tokenization by Arvind Sundararajan
dev.to·1d
🔤Language Tokenizers
StringWa.rs on GPUs: Databases & Bioinformatics 🦠
ashvardanian.com·27m
🚀Tokenizer Performance
Semantic Dictionary Encoding
falvotech.com·4h
🗂️Type Indexing
Learn How to Use Transformers with HuggingFace and SpaCy
towardsdatascience.com·5h
📊Pratt Parsers
LLM Rerankers for RAG: A Practical Guide
fin.ai·21h
🪜Recursive Descent
[D] How to best fine-tune a T5 model for a Seq2Seq extraction task with a very small dataset?
reddit.com·4h
📊Parse Tables
Text-to-SQL Oriented to the Process Mining Domain: A PT-EN Dataset for Query Translation
arxiv.org·15h
🧠Semantic Parsing
I Tested AI 'Humanizers' to See How Well They Actually Disguise AI Writing
lifehacker.com·1h
📚Factor
I built an LLM from Scratch in Rust (Just ndarray and rand)
github.com·1d
🌱Minimal ML
Challenges You Will Face When Parsing PDFs with Python
theseattledataguy.com·4h
🚀Tokenizer Performance
Creativity Benchmark: A benchmark for marketing creativity for LLM models
arxiv.org·15h
🏁Language Benchmarks
Glypht
glypht.valadaptive.dev·3h
🔄Incremental Lexing
How to turn Claude Code into a domain specific coding agent
blog.langchain.com·3h
🎮Language Ergonomics
Building Writely with Kiro: From PRD to Browser Extension in Hours 🎉
youtu.be·8h
💬Interactive REPLs
Language Models Pack Billions of Concepts into 12,000 Dimensions
nickyoder.com·15h
🌱Minimal ML
RustGPT: A pure-Rust transformer LLM built from scratch
dev.to·6h
🏗️Cranelift
AI Tokenization Services
dev.to·9h
🔍Tokenizers
Comprehensive LLM Evaluation: Metrics, Methods, and Use Case Considerations
nexla.com·22h
🌱Minimal ML
Extractly - Turn PDFs into Data
dev.to·12h
🔄Incremental Lexing
How to Access Qwen3-Next API for Free?
analyticsvidhya.com·1h
🚀Tokenizer Performance